Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage / Faggiani, Erica; Faralli, Stefano; Velardi, Paola. - 1652: Communications in Computer and Information Science (2022), pp. 593-604. (Paper presented at the European Conference on Advances in Databases and Information Systems, held in Torino, Italy) [10.1007/978-3-031-15743-1_54].
Neural Word Sense Disambiguation to Prune a Large Knowledge Graph of the Italian Cultural Heritage
Faralli, Stefano; Velardi, Paola
2022
Abstract
In this paper, we describe our recent findings on interlinking ArCo Italian cultural heritage entities to the well-known Getty Art and Architecture (GVP) Thesaurus. Candidate entities are extracted automatically from textual descriptions, and ambiguous out-of-domain entities are then pruned using Neural Word Sense Disambiguation. The disambiguation task is particularly complex because, as detailed in this paper, we map Italian entities in the ArCo cultural heritage knowledge graph onto lexical concepts in English (such as those in the GVP Thesaurus). To date, the majority of entity linking and word sense disambiguation systems are designed to work with English and with general-purpose sense inventories and knowledge bases, such as DBpedia, BabelNet and WordNet. To address this challenging entity linking and disambiguation task, we adapted a state-of-the-art Neural Word Sense Disambiguation system to work in this multilingual setting. We describe our adaptation process and discuss preliminary experimental results.